Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XFA - Add a parser for XFA files #12879

Merged
merged 1 commit into from
Feb 2, 2021
Merged

Conversation

calixteman
Copy link
Contributor

  • the parser is base on a class extending XMLParserBase
  • it handle xml namespaces:
    • each namespace is assocated with a builder
    • builder builds nodes belonging to the namespace
    • when a node is inserted in the parent namespace compatibility is checked (if required)
  • to avoid name collision between xml names and object properties, use Symbol.

@calixteman
Copy link
Contributor Author

The objects in config.js or in xdp.js reflect what we've in the documentation, for example the acrobat element documentation is:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1298&zoom=auto,-207,679
and the corresponding class:
https://github.com/mozilla/pdf.js/pull/12879/files#diff-0fde7327b01926a0a21537877b58611fa916d5cd228e33673678bc37565db84aR26

gulpfile.js Outdated Show resolved Hide resolved
this.element = null;
}

[$onChild](child) {
Copy link
Contributor

@timvandermeij timvandermeij Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this syntax work? It doesn't look like a valid function name with these special characters in it; is this some kind of way to insert a variable as the function name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$onChild is defined in xfa_object like this: const $onChild = Symbol();
And the way to use this symbol as a function name is to use brackets:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol#symbol_wrapper_objects_as_property_keys
The only way to call this function is to have the symbol so there's no access by name (as a string).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, learned something new then :-)

It's not entirely clear yet as to why this is needed exactly (i.e., why access by name causes conflicts here), but that requires a more in-depth look. This syntax was just something I noticed and didn't know; thanks for explaining this!

Copy link
Contributor Author

@calixteman calixteman Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the specs there is something call SOM expressions to select nodes:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=101&zoom=auto,-207,766
These expressions seem to be used in different places.
And so in order to make easy their interpretation, each part of the expression can be searched as a own property of the object without the need to keep a flag somewhere to check if the property exists by spec or if it is an implementation detail.
And I guess it'll avoid any bad use an attacker could do with specific expressions.
So my idea was to have objects which are the exact reflection of specs in hiding implementations details and thx to that we can enjoy the object model stuff like overriding $onChildCheck method when a specific node can accept a node from another namespace.
So from my pov, it makes the implementation safer, simpler and reduce memory use (no need to track spec properties...).
And since it's unusual I added this $ symbols to help to see them.

src/core/xfa/xfa_object.js Outdated Show resolved Hide resolved
src/core/xfa/xfa_object.js Show resolved Hide resolved
 - the parser is base on a class extending XMLParserBase
 - it handle xml namespaces:
   * each namespace is assocated with a builder
   * builder builds nodes belonging to the namespace
   * when a node is inserted in the parent namespace compatibility is checked (if required)
 - to avoid name collision between xml names and object properties, use Symbol.
@brendandahl
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Feb 1, 2021

From: Bot.io (Windows)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://3.101.106.178:8877/7363845babd30f9/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Feb 1, 2021

From: Bot.io (Linux m4)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://54.67.70.0:8877/73961cfd056ac56/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Feb 1, 2021

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/73961cfd056ac56/output.txt

Total script time: 26.86 mins

  • Font tests: Passed
  • Unit tests: FAILED
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/73961cfd056ac56/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Feb 1, 2021

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/7363845babd30f9/output.txt

Total script time: 28.46 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/7363845babd30f9/reftest-analyzer.html#web=eq.log

@brendandahl brendandahl merged commit e16c99f into mozilla:master Feb 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants